Rank | Count | Beginning |
---|---|---|
82834 | 6295 | U |
35181 | 2498 | Na |
61391 | 1156 | Prema |
1 | 1110 | • |
37729 | 1029 | Nakon |
95766 | 1027 | Za |
48300 | 924 | On |
25961 | 785 | Kako |
79911 | 740 | To |
46319 | 714 | Od |
18739 | 583 | I |
95944 | 527 | Zadnja |
22359 | 520 | Iz |
69300 | 475 | S |
32113 | 467 | Međutim, |
53397 | 444 | Ovo |
77717 | 424 | Također, |
50648 | 423 | Osim |
51998 | 416 | Ovaj |
80462 | 409 | Tokom |
98679 | 406 | Zbog |
9313 | 401 | Da, |
55359 | 387 | Po |
18748 | 385 | Iako |
77533 | 381 | Tako |
57730 | 380 | Pored |
36323 | 373 | Naime, |
51884 | 373 | Ova |
26915 | 351 | Kao |
25452 | 334 | Kada |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
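The same count can be sketched outside the database, which also makes it easy to extend to N = 2, 3, 4. The following is a minimal Python sketch, not the tooling used to produce the table above: the function name `top_beginnings` and the sample sentences are hypothetical, and it assumes that sentences are plain strings whose words are separated by single spaces, as the `substring_index(sentence, ' ', 1)` call in the query implies.

```python
from collections import Counter


def top_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Taking the first n space-separated pieces mirrors
    substring_index(sentence, ' ', n); Counter.most_common mirrors
    GROUP BY ... ORDER BY cnt DESC LIMIT ...
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    return counts.most_common(limit)


# Hypothetical sample sentences, invented for illustration only.
sample = [
    "U gradu je kiša.",
    "U selu je sunce.",
    "Na stolu je knjiga.",
    "U gradu je gužva.",
]

print(top_beginnings(sample, n=1))
print(top_beginnings(sample, n=2))
```

Splitting on a single space rather than on arbitrary whitespace keeps the behaviour close to the SQL query, so punctuation attached to the first word (as in "Međutim," in the table) stays part of the beginning.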